The purpose of this analysis is to compare as many of the tools and methods at our disposal as possible for classifying the synergies observed in the dataset published by (Flobak et al. 2019). Note that throughout this analysis the term SINTEF dataset will refer to the first screen dataset from that paper.
By methods we refer both to the mathematical models like Bliss and HSA (Highest Single Agent) used for the synergy classification and to the data processing methodology for identifying and extracting the synergies from the dataset.
Currently, this analysis is mostly based on the use of the rbbt tool with its specified workflows and tasks for finding the observed synergies of the SINTEF dataset using various parameters as input. Data results from (Flobak et al. 2019) will also be assessed.
Summary of the main findings:

- ConPropExcess results (F1-Scores) are higher for Bliss than for HSA, compared to the other 2 tasks where there is no statistically significant difference.
- The more sophisticated the task (Avg < ConExcess < ConPropExcess), the higher the correlation between Bliss and HSA.
- In the Avg and ConExcess tasks, HSA has a larger percentage of higher F1-Scores (∼5% and ∼9% more than Bliss respectively).
- In the ConPropExcess task, Bliss has a larger percentage of higher F1-Scores (∼13% more compared to HSA).
- There are always points below the y=x line: no matter the cell line or task, there are always some Bliss-related methods that produce higher F1-Scores than the corresponding HSA ones.
- The best task overall was the ConExcess method (using HSA).
- For the AGS cell line we have a perfect synergy assessment: the best task results match exactly the synergies of the curated Gold-Standard for AGS!
- The best parameter values (p_value_bonferroni and mean_bliss_threshold) for the Bliss-all-combinations method (from (Flobak et al. 2019)) that result in the highest average F1-Score across all 8 cell lines still perform lower than the highest ones attained using the rbbt tasks.
Loading libraries and helping functions:
library(usefun)
library(dplyr)
library(tidyr)
library(readr)
library(stringr)
library(tibble)
library(plotly)
library(DT)
library(ggpubr)
library(RColorBrewer)
source(file = "fun.R")
The rbbt bioinformatics toolkit has several workflows; the most interesting ones for our investigation are SINTEF and CombinationIndex - the latter published as a standalone tool in (Flobak et al. 2017).
So, using the rbbt tool and the SINTEF workflow, we are going to computationally define the synergies in the SINTEF dataset (8 cell lines) and check which methods/tasks (and input parameters) come closer to our gold-curation standard.
The SINTEF workflow provides several tasks to calculate the observed synergies for the cell lines of the SINTEF data paper, and all of these tasks internally use the CombinationIndex tasks bliss and hsa. These two tasks take a lot of time to calculate - around \(1\) hour per cell line - but they are independent of the rest of the analysis, in the sense that their results never change (they depend only on the SINTEF dataset).
So the data from these two tasks needs to be pre-calculated before we proceed further with the analysis. We provide either the data pre-calculated (and a way to extract it for use in the analysis) or a way to calculate it yourself (DIY):
Move the two archived files to ~/.rbbt/var/jobs/SINTEF (where the job files are stored) and run:
tar xzvf sintef.combination.index.bliss.tar.gz
tar xzvf sintef.combination.index.hsa.tar.gz
Thus, two directories named bliss and hsa will be created (with the respective results inside).
bliss and hsa task results from the SINTEF dataset (DIY way)

Given an rbbt.SINTEF.image (ask the author for it) and a SINTEF workflow at least at commit 947307b, you can use the run_cimbinator.sh script to produce the data results.
In December 2019, we renamed some of the tasks in the SINTEF workflow in order to better understand what these methods actually do and to also add some new ones. We now have the following methods (rbbt tasks) for calculating synergy scores in the SINTEF dataset (the SINTEF workflow has to be at least on commit 2c5ea):
- synergy_classification_by_GS:
rbbt workflow task SINTEF synergy_classification_by_GS -h
... --cell_line=A498 | grep -v SF
- synergy_classification_by_avg:
rbbt workflow task SINTEF synergy_classification_by_avg -h
... --ci_method=hsa --excess_threshold=-0.10 --cell_line=AGS | grep -v SF
- synergy_classification_by_consecutive_excesses:
rbbt workflow task SINTEF synergy_classification_by_consecutive_excesses -h
... --ci_method=hsa --consecutive_excess_count=2 --excess_threshold=-0.15 --cell_line=AGS
- synergy_classification_by_consecutive_proportional_excesses:
rbbt workflow task SINTEF synergy_classification_by_consecutive_proportional_excesses -h
... --ci_method=bliss --consecutive_excess_count=2 --excess_proportional_threshold=0.2 --cell_line=AGS

The tasks differ in how they use their parameters:

- The avg task classifies a combination as synergistic based on its excess relative to the excess_threshold.
- The consecutive excesses task requires a consecutive_excess_count amount of consecutive points of the non-interaction curve (Bliss or HSA) that are lower than the corresponding points of the combined response curve by a value/distance of excess_threshold.
- The consecutive proportional excesses task requires a consecutive_excess_count amount of consecutive points of the non-interaction curve (Bliss or HSA) that are lower than the corresponding points of the combined response curve by a threshold value that is determined from the proportion (excess_proportional_threshold) of the excess relative to the magnitude of the combined response.

So, for each respective cell line we get the gold-curated standard (GS) synergies (already pre-calculated using this script):
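To make the consecutive-excess idea concrete, here is a minimal standalone sketch (this is NOT the rbbt implementation; the function name, the representation of the curves as plain numeric vectors, and the sign convention of the excess are all assumptions for illustration):

```r
# Illustrative sketch of the consecutive-excesses rule (NOT the rbbt code):
# flag a combination as synergistic when at least `count` consecutive points
# show an excess at or below the (negative) `threshold`. The proportional
# variant would instead derive the threshold from the excess relative to the
# magnitude of the combined response.
is_con_excess_synergy = function(combined, non_interaction,
                                 count = 2, threshold = -0.15) {
  excess = combined - non_interaction  # negative where the combination beats the reference
  below  = excess <= threshold
  runs   = rle(below)                  # lengths of consecutive TRUE/FALSE stretches
  any(runs$values & runs$lengths >= count)
}

# Toy curves: three consecutive points with excess below -0.15 => synergy
is_con_excess_synergy(combined        = c(0.90, 0.60, 0.50, 0.40),
                      non_interaction = c(0.90, 0.90, 0.90, 0.90))
# => TRUE
```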
gs.list = readRDS(file = "gs_list.rds")
And these are:
cell.lines = c("A498", "AGS", "Colo205", "DU145", "MDA-MB-468", "SF295", "SW620", "UACC62")
print.gs.synergies = function() {
pretty_print_string("")
for (cell.line in cell.lines) {
pretty_print_bold_string(string = cell.line, with.gt = FALSE)
print_empty_line(html.output = TRUE)
pretty_print_vector_values(gs.list[[cell.line]], with.gt = FALSE, vector.values.str = "synergies")
print_empty_line(html.output = TRUE)
print_empty_line(html.output = TRUE)
}
}
print.gs.synergies()
A498
10 synergies: BI-PI, AK-BI, PD-G2, AK-PD, PI-D1, BI-PD, AK-G4, PD-P5, AK-60, 5Z-G2
AGS
6 synergies: AK-BI, PI-D1, BI-D1, PI-G2, PD-PI, 5Z-PI
Colo205
3 synergies: AK-BI, AK-60, BI-PI
DU145
1 synergy: AK-BI
MDA-MB-468
10 synergies: AK-BI, BI-PI, AK-PI, PI-JN, PI-D1, PI-D4, CT-PI, PD-PI, AK-PD, PD-JN
SF295
3 synergies: AK-BI, PI-JN, PI-P5
SW620
3 synergies: AK-BI, PI-JN, PI-G2
UACC62
5 synergies: AK-BI, BI-PI, PI-D1, PI-G2, CT-PI
As we saw above, for every cell line we have a set of GS synergies. Also, every SINTEF task defined above - referred to here for simplicity as a method - will, given different parameters, produce a different set of synergies, some of which will be in the respective GS set and some maybe not.
We want to define a matching score that tells us how close the calculated synergy set of a method is to the respective GS one. Some definitions first:

- \(gset\): the gold-standard synergy set, with \(l_{gset}\) elements
- \(mset\): the method-produced synergy set, with \(l_{mset}\) elements
- \(hits\): the number of elements of \(mset\) that are also in \(gset\)
- \(misses\): the number of elements of \(mset\) that are not in \(gset\) (so \(hits+misses=l_{mset}\))
So, it must be pretty obvious that our hits-and-misses score should try to maximize the \(hits\) and minimize the \(misses\), which mathematically can be expressed as \(max(hits-misses)\). Observe the extreme values of this difference: in the \(max\) case the two sets compared are exactly the same (\(mset=gset\)):
\[max_{hm}=max(hits-misses)=l_{gset}-0=l_{gset}\]
and in the case where \(mset \cap gset=\emptyset\):
\[min_{hm}=min(hits-misses)=0-max(misses)=-l_{mset}\]
So, given the 2 sets \(gset\) and \(mset\), with some \(hits\) and \(misses\) as defined above, we produce the normalized \(score_{hm}\in[0,1]\) of how well the method’s produced set matches the gold-standard set as:
\[score_{hm}=\frac{(hits-misses)-min_{hm}}{max_{hm}-min_{hm}}=\frac{hits-misses+l_{mset}}{l_{gset}+l_{mset}}\]
# for the examples
gset = c("a","b","c")
mset1 = c("a","b","c")
mset2 = c("b","c","d")
mset3 = c("c","d","e")
mset4 = c("b","c","e","f","g","j","o")
mset5 = c("c","d","e","f","g","j","o")
mset6 = c("d","e")
Consider the following examples to understand how the formula works (each letter represents a different drug combination):
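A standalone sketch of the score applied to these example sets (the analysis itself uses the get.score helper from fun.R; score_hm here is a re-implementation for illustration only):

```r
# Re-implementation sketch of the hits-and-misses score:
# score_hm = (hits - misses + l_mset) / (l_gset + l_mset)
score_hm = function(gset, mset) {
  hits   = length(intersect(mset, gset))  # elements of mset found in gset
  misses = length(setdiff(mset, gset))    # elements of mset not in gset
  (hits - misses + length(mset)) / (length(gset) + length(mset))
}

gset = c("a", "b", "c")
score_hm(gset, c("a", "b", "c"))                      # exact match => 1
score_hm(gset, c("b", "c", "d"))                      # 2 hits, 1 miss => 4/6
score_hm(gset, c("b", "c", "e", "f", "g", "j", "o"))  # extra misses dilute the score => 4/10
score_hm(gset, c("d", "e"))                           # no hits => 0
```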
For the advanced reader, note that the hits-and-misses \(score_{hm}\), is actually the F1-score (!), since considering that \(hits=TP\), \(misses=FP\), \(l_{mset}=TP+FP\) and \(l_{gset}=P=TP+FN\), we have that: \[score_{hm}=\frac{hits-misses+l_{mset}}{l_{gset}+l_{mset}}=\frac{TP-FP+TP+FP}{TP+FN+TP+FP}=\frac{2\times TP}{2\times TP+FN+FP}=F_1\]
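A quick numerical sanity check of this equivalence, on a made-up pair of sets:

```r
# Check that the hits-and-misses score coincides with the F1-score
gset = c("a", "b", "c")       # gold standard: P = TP + FN
mset = c("b", "c", "d", "e")  # method output: TP + FP
hits   = length(intersect(mset, gset))  # TP = 2
misses = length(setdiff(mset, gset))    # FP = 2
score_hm = (hits - misses + length(mset)) / (length(gset) + length(mset))

TP = hits; FP = misses; FN = length(setdiff(gset, mset))
precision = TP / (TP + FP)
recall    = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)
all.equal(score_hm, f1)  # TRUE (both equal 4/7)
```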
To perform a complete search on the methods parameter space and derive a score for each method (with a specific parameter combination in each case), run this script. We have already run it for efficiency and now simply load the result object to continue with the analysis:
res.data = as_tibble(readRDS(file = "res_data.rds"))
So, the results consist of the F1-Scores for every possible combination of:

- task (Avg, ConExcess, ConPropExcess)
- ci_method (bliss, hsa)
- cell line (the 8 SINTEF cell lines)

and parameters (applicable to specific tasks):

- excess_threshold (Avg and ConExcess tasks)
- consecutive_excess_count (ConExcess and ConPropExcess tasks)
- excess_proportional_threshold (ConPropExcess task)
Let’s compare from the raw data (so not matching them exactly parameter-to-parameter) the two mathematical models that were used in the calculation of synergies. First, we will do that across all possible tasks, cell lines and parameters:
ggboxplot(data = res.data, x = "ci_method", y = "f1.score",
title = "Bliss vs HSA across all cell lines, tasks and parameters",
xlab = "Synergy Method", ylab = "F1-Score", color = "ci_method",
add = "jitter", palette = "Set1", shape = "ci_method") +
stat_compare_means(label.x = 1.35)
It seems that there is a small difference in favor of the Bliss-related tasks.
Let’s check if the same trend can be seen when we split the raw data per cell line:
ggboxplot(data = res.data, x = "cell_line", y = "f1.score", fill = "ci_method",
title = "Bliss vs HSA per cell line across all tasks and parameters",
xlab = "Cell lines", ylab = "F1-Score", palette = "Set1") +
stat_compare_means(aes(group = ci_method), label = "p.signif") +
theme(axis.text.x = element_text(size = 11))
Note that some methods produce a perfect match (F1-Score == 1) for some cell lines.

Next, we split the raw data per SINTEF task:

ggboxplot(data = res.data, x = "task", y = "f1.score", fill = "ci_method",
title = "Bliss vs HSA per SINTEF task, across all cell lines and parameters",
xlab = "Tasks", ylab = "F1-Score", palette = "Set1") +
stat_compare_means(aes(group = ci_method), label = "p.signif")
- There are no methods in the ConPropExcess task, for any cell line, that produce synergies matching exactly the curated standard synergies in that cell line (F1-Score equal to \(1\)).
- HSA seems to do slightly better in the Avg and ConExcess tasks.
- Bliss seems to do better in the ConPropExcess task.

We will now compare Bliss vs HSA synergy methods for each different task in separate graphs. This will allow us to observe if in each task there is a small Bliss advantage over HSA, or if it actually depends more on each cell line:
ggboxplot(data = res.data %>% filter(task == "Avg"), x = "cell_line", y = "f1.score",
fill = "ci_method", title = "Bliss vs HSA per cell line (Task: Avg)",
xlab = "Cell lines", ylab = "F1-Score", palette = "Set1") +
stat_compare_means(aes(group = ci_method), label = "p.signif") +
theme(axis.text.x = element_text(size = 11))
ggboxplot(data = res.data %>% filter(task == "ConExcess"), x = "cell_line", y = "f1.score",
fill = "ci_method", title = "Bliss vs HSA per cell line (Task: ConExcess)",
xlab = "Cell lines", ylab = "F1-Score", palette = "Set1") +
stat_compare_means(aes(group = ci_method), label = "p.signif") +
theme(axis.text.x = element_text(size = 11))
ggboxplot(data = res.data %>% filter(task == "ConPropExcess"), x = "cell_line", y = "f1.score",
fill = "ci_method", title = "Bliss vs HSA per cell line (Task: ConPropExcess)",
xlab = "Cell lines", ylab = "F1-Score", palette = "Set1") +
stat_compare_means(aes(group = ci_method), label = "p.signif") +
theme(axis.text.x = element_text(size = 11))
Now we tidy our raw data in order to compare homogeneous data points - i.e. points coming from the same task, parameters and cell line, and differing only in the synergy method (Bliss vs HSA).
First for the Avg task:
data.task.avg = res.data %>%
filter(task == "Avg") %>%
group_by(ci_method) %>%
arrange(ci_method, excess_threshold, cell_line) %>%
transmute(cell_line, excess_threshold, f1.score) %>%
group_split() %>%
bind_cols()
# count Bliss vs HSA F1-Scores
task.avg.counts = data.task.avg %>%
transmute(equal = (f1.score == f1.score1), less = (f1.score < f1.score1), larger = (f1.score > f1.score1)) %>%
summarise(equal = sum(equal), less = sum(less), larger = sum(larger))
ggscatter(data = data.task.avg, x = "f1.score", y = "f1.score1", color = "cell_line",
xlab = "Bliss F1-Scores", ylab = "HSA F1-Scores",
title = "'Avg' task (Kendall Correlation)",
add = "reg.line", add.params = list(color = "blue", fill = "lightgray"),
conf.int = TRUE, cor.coef = TRUE,
cor.coeff.args = list(method = "kendall", label.x.npc = 0.7, label.y.npc = 0.1)) +
geom_segment(x = 0, y = 0, xend = 1, yend = 1, size = 0.3, linetype = "dashed") +
annotate(geom = "text", label = "y = x", x = 0.93, y = 1) +
annotate(geom = "text", label = paste0("Bliss < HSA: ", signif(100*task.avg.counts$less/sum(task.avg.counts), digits = 2), "%"), x = 0.15, y = 1) +
annotate(geom = "text", label = paste0("Bliss = HSA: ", signif(100*task.avg.counts$equal/sum(task.avg.counts), digits = 2), "%"), x = 0.15, y = 0.9) +
annotate(geom = "text", label = paste0("Bliss > HSA: ", signif(100*task.avg.counts$larger/sum(task.avg.counts), digits = 2), "%"), x = 0.15, y = 0.8)
Next for the ConExcess task:
data.task.con.excess = res.data %>%
filter(task == "ConExcess") %>%
group_by(ci_method) %>%
arrange(ci_method, consecutive_excess_count, excess_threshold, cell_line) %>%
transmute(consecutive_excess_count, excess_threshold, cell_line, f1.score) %>%
group_split() %>%
bind_cols()
# count Bliss vs HSA F1-Scores
task.con.excess.counts = data.task.con.excess %>%
transmute(equal = (f1.score == f1.score1), less = (f1.score < f1.score1), larger = (f1.score > f1.score1)) %>%
summarise(equal = sum(equal), less = sum(less), larger = sum(larger))
ggscatter(data = data.task.con.excess, x = "f1.score", y = "f1.score1", color = "cell_line",
xlab = "Bliss F1-Scores", ylab = "HSA F1-Scores",
title = "'ConExcess' task (Kendall Correlation)",
add = "reg.line", add.params = list(color = "blue", fill = "lightgray"),
conf.int = TRUE, cor.coef = TRUE,
cor.coeff.args = list(method = "kendall", label.x.npc = 0.7, label.y.npc = 0.1)) +
geom_segment(x = 0, y = 0, xend = 1, yend = 1, size = 0.3, linetype = "dashed") +
annotate(geom = "text", label = "y = x", x = 0.93, y = 1) +
annotate(geom = "text", label = paste0("Bliss < HSA: ", signif(100*task.con.excess.counts$less/sum(task.con.excess.counts), digits = 2), "%"), x = 0.15, y = 1) +
annotate(geom = "text", label = paste0("Bliss = HSA: ", signif(100*task.con.excess.counts$equal/sum(task.con.excess.counts), digits = 2), "%"), x = 0.15, y = 0.9) +
annotate(geom = "text", label = paste0("Bliss > HSA: ", signif(100*task.con.excess.counts$larger/sum(task.con.excess.counts), digits = 2), "%"), x = 0.15, y = 0.8)
Lastly, for the ConPropExcess task:
data.task.con.prop.excess = res.data %>%
filter(task == "ConPropExcess") %>%
group_by(ci_method) %>%
arrange(ci_method, consecutive_excess_count, excess_proportional_threshold, cell_line) %>%
transmute(consecutive_excess_count, excess_proportional_threshold, cell_line, f1.score) %>%
group_split() %>%
bind_cols()
# count Bliss vs HSA F1-Scores
task.con.prop.excess.counts = data.task.con.prop.excess %>%
transmute(equal = (f1.score == f1.score1), less = (f1.score < f1.score1), larger = (f1.score > f1.score1)) %>%
summarise(equal = sum(equal), less = sum(less), larger = sum(larger))
ggscatter(data = data.task.con.prop.excess, x = "f1.score", y = "f1.score1", color = "cell_line",
xlab = "Bliss F1-Scores", ylab = "HSA F1-Scores",
title = "'ConPropExcess' task (Kendall Correlation)",
add = "reg.line", add.params = list(color = "blue", fill = "lightgray"),
conf.int = TRUE, cor.coef = TRUE,
cor.coeff.args = list(method = "kendall", label.x.npc = 0.7, label.y.npc = 0.1)) +
geom_segment(x = 0, y = 0, xend = 0.8, yend = 0.8, size = 0.3, linetype = "dashed") +
annotate(geom = "text", label = "y = x", x = 0.73, y = 0.8) +
annotate(geom = "text", label = paste0("Bliss < HSA: ", signif(100*task.con.prop.excess.counts$less/sum(task.con.prop.excess.counts), digits = 2), "%"), x = 0.15, y = 0.8) +
annotate(geom = "text", label = paste0("Bliss = HSA: ", signif(100*task.con.prop.excess.counts$equal/sum(task.con.prop.excess.counts), digits = 2), "%"), x = 0.15, y = 0.7) +
annotate(geom = "text", label = paste0("Bliss > HSA: ", signif(100*task.con.prop.excess.counts$larger/sum(task.con.prop.excess.counts), digits = 2), "%"), x = 0.15, y = 0.6)
- The correlation between the Bliss and HSA F1-Scores is highest in the ConPropExcess task, lower in the ConExcess task and even lower in the Avg task. So the more sophisticated the method is, the larger the correlation between Bliss and HSA (more common F1-Scores).
- In the Avg and ConExcess tasks, HSA has a larger percentage of higher F1-Scores (\(\sim5\%\) and \(\sim 9\%\) difference respectively).
- In the ConPropExcess task, Bliss has a larger percentage of higher F1-Scores (\(\sim 13\%\) difference). This also means that the value differences of the F1-Scores between HSA and Bliss are larger in magnitude in the ConPropExcess task compared to the other two tasks (since the difference percentages somewhat negate each other across the 3 tasks but there is a difference in the overall data in favor of Bliss).
- There are always points below the y=x line, which means that there are higher Bliss values produced by some methods (no matter the task or cell line): these (outlier-sort-of) Bliss results outperform the similar HSA ones.

Our job now is to find the task with the set of parameters that gives the best (HSA- or Bliss-based) F1-Score across all the cell lines tested - i.e. the task that best fits the curated GS across all cell lines.
So we will aggregate the results across the 8 cell lines by producing the average F1-Score for each specific synergy task with all its combinations of parameters (for both Bliss and HSA).
We calculate the average F1-Scores per task and find the maximum ones across all cell lines:
# Avg task
avg.f1.task.avg = res.data %>%
filter(task == "Avg") %>%
group_by(ci_method, excess_threshold) %>%
summarise(avg.f1.score = mean(f1.score)) %>%
ungroup() %>%
add_column(task = "Avg", .before = 1)
data.list = list()
data.list[[1]] =
avg.f1.task.avg %>%
group_by(ci_method) %>%
filter(avg.f1.score == max(avg.f1.score)) %>%
ungroup()
# ConExcess task
avg.f1.task.con.excess = res.data %>%
filter(task == "ConExcess") %>%
group_by(ci_method, consecutive_excess_count, excess_threshold) %>%
summarise(avg.f1.score = mean(f1.score)) %>%
ungroup() %>%
add_column(task = "ConExcess", .before = 1)
data.list[[2]] =
avg.f1.task.con.excess %>%
group_by(ci_method) %>%
filter(avg.f1.score == max(avg.f1.score)) %>%
ungroup()
# ConPropExcess task
avg.f1.task.con.prop.excess = res.data %>%
filter(task == "ConPropExcess") %>%
group_by(ci_method, consecutive_excess_count, excess_proportional_threshold) %>%
summarise(avg.f1.score = mean(f1.score)) %>%
ungroup() %>%
add_column(task = "ConPropExcess", .before = 1)
data.list[[3]] =
avg.f1.task.con.prop.excess %>%
group_by(ci_method) %>%
filter(avg.f1.score == max(avg.f1.score)) %>%
ungroup()
best.avg.f1.scores = bind_rows(data.list) %>%
select(task, ci_method, avg.f1.score, excess_threshold, consecutive_excess_count, excess_proportional_threshold)
max.avg.f1.score = max(best.avg.f1.scores$avg.f1.score)
# color avg.f1.scores
f1.breaks = quantile(best.avg.f1.scores %>% pull(avg.f1.score), probs = seq(.05, .95, .05))
f1.colors = round(seq(255, 40, length.out = length(f1.breaks) + 1), 0) %>%
{paste0("rgb(255,", ., ",", ., ")")} # red
options(persistent = TRUE)
caption.title = "Highest average F1-Score Tasks with respective parameters"
datatable(data = best.avg.f1.scores,
options = list(dom = "t", # just show the table
order = list(list(3, 'desc')),
columnDefs = list(list(className = 'dt-center', targets = 3:6))),
caption = htmltools::tags$caption(caption.title, style="color:#dd4814; font-size: 18px")) %>%
formatStyle(columns = c("avg.f1.score"), backgroundColor = styleInterval(f1.breaks, f1.colors)) %>%
formatRound(3:6, digits = 3)
As we expected from the results of an earlier graph, the ConPropExcess task produces the lowest among the maximum average F1-Scores across all tasks tested (and its Bliss F1-Score is larger than the respective HSA one).
The best average F1-Score across all 8 cell lines was produced by the ConExcess method. The rbbt command line parameters to produce the synergies per given cell line using the best task are:
rbbt workflow task SINTEF synergy_classification_by_consecutive_excesses --consecutive_excess_count=2 --excess_threshold=-0.17 --ci_method=hsa --cell_line=<NAME> | grep -v SF
best.task.data = res.data %>%
filter(task == "ConExcess", consecutive_excess_count == 2, excess_threshold == -0.17, ci_method == "hsa") %>%
select(cell_line, f1.score)
datatable(data = best.task.data, options = list(dom = "t"))
For the AGS cell line we have a perfect synergy assessment: the best task results match exactly the synergies of the curated Gold-Standard for AGS!
We now visualize the parameter space for each task and per ci_method (Bliss vs HSA):
# title font style
my.title.font = list(family = "Courier New, monospace", size = 18, color = "black")
plot_ly(data = avg.f1.task.avg, x = ~excess_threshold, y = ~avg.f1.score, width = 800,
type = "scatter", mode = "lines+markers", color = ~ci_method, colors = "Set1") %>%
layout(title = list(text = "Avg Task - Mean F1-Score across 8 cell lines", font = my.title.font),
xaxis = list(title = "Excess Threshold", zeroline = FALSE),
yaxis = list(title = "Mean F1-Score")) %>%
config(displayModeBar = FALSE)
# font style
my.colorbar.font = list(family = "Courier New, monospace", size = 14, color = "black")
fig1 = plot_ly(avg.f1.task.con.excess %>% filter(ci_method == "bliss"), x = ~consecutive_excess_count,
y = ~excess_threshold, z = ~avg.f1.score, zmin = 0, zmax = max.avg.f1.score, width = 1000,
colorscale = "Jet", type = "contour", contours = list(showlabels = TRUE)) %>%
colorbar(title = list(text = "Mean F1-Score \n(8 cell lines)", font = my.colorbar.font),
tickmode = "linear", tick0 = 0, dtick = 0.1) %>%
layout(annotations = list(
list(x = 0.5, y = 1, xanchor = "center", yanchor = "bottom", align = "center",
xref='paper', yref='paper', showarrow = FALSE,
font = my.title.font, text = "ConExcess Task - Bliss")))
fig2 = plot_ly(avg.f1.task.con.excess %>% filter(ci_method == "hsa"), x = ~consecutive_excess_count,
y = ~excess_threshold, z = ~avg.f1.score, zmin = 0, zmax = max.avg.f1.score, width = 1000,
colorscale = "Jet", type = "contour", contours = list(showlabels = TRUE),
showscale = FALSE) %>% layout(annotations = list(
list(x = 0.5, y = 1, xanchor = "center", yanchor = "bottom", align = "center",
xref='paper', yref='paper', showarrow = FALSE,
font = my.title.font, text = "ConExcess Task - HSA")))
subplot(fig1, fig2, titleX = TRUE, titleY = TRUE, margin = 0.05) %>%
config(displayModeBar = FALSE)
fig1 = plot_ly(avg.f1.task.con.prop.excess %>% filter(ci_method == "bliss"), x = ~consecutive_excess_count,
y = ~excess_proportional_threshold, z = ~avg.f1.score, zmin = 0, zmax = max.avg.f1.score, width = 1000,
colorscale = "Jet", type = "contour", contours = list(showlabels = TRUE)) %>%
colorbar(title = list(text = "Mean F1-Score \n(8 cell lines)", font = my.colorbar.font),
tickmode = "linear", tick0 = 0, dtick = 0.1) %>%
layout(annotations = list(
list(x = 0.5, y = 1, xanchor = "center", yanchor = "bottom", align = "center",
xref='paper', yref='paper', showarrow = FALSE,
font = my.title.font, text = "ConPropExcess Task - Bliss")))
fig2 = plot_ly(avg.f1.task.con.prop.excess %>% filter(ci_method == "hsa"), x = ~consecutive_excess_count,
y = ~excess_proportional_threshold, z = ~avg.f1.score, zmin = 0, zmax = max.avg.f1.score, width = 1000,
colorscale = "Jet", type = "contour", contours = list(showlabels = TRUE),
showscale = FALSE) %>% # no colorbar
layout(annotations = list(
list(x = 0.5, y = 1, xanchor = "center", yanchor = "bottom", align = "center",
xref='paper', yref='paper', showarrow = FALSE,
font = my.title.font, text = "ConPropExcess Task - HSA")))
subplot(fig1, fig2, titleX = TRUE, titleY = TRUE, margin = 0.05) %>%
config(displayModeBar = FALSE)
- The ConExcess method is the most verbose with regard to its parameter space: a larger number of parameter combinations can be attributed to higher mean F1-Scores.
- There is a similar high-scoring (1D for the Avg task, 2D for the other 2 tasks) parameter sub-space when comparing Bliss vs HSA - i.e. the synergy methods correlate in the parameter space.

The Bliss-all-combinations method (I arbitrarily chose the name) was the method that Barbara and Asmund used in (Flobak et al. 2019) on the data from the first screen to find and visualize the synergies across all 8 cell lines. The method is described in the text and Asmund wrote an R script that calculates the average Bliss excesses and significance scores (p_values_Bliss.R in the supplementary material - it generates the bliss_significance.tsv file). Combinations with average Bliss excess values \(<-0.08\) and Bonferroni-corrected p-value \(<0.05\) were classified as synergies (the results were put into the bliss_synergies.tsv file in the SI of the paper).
The two threshold values above (the average Bliss excess and the corrected p-value) were more-or-less intuitively chosen (a good-enough p-value plus an OK negative threshold, so to speak). We now want to compare the synergies found with this method (across different combinations of the aforementioned parameters) with the GS synergies in each cell line and see which come closer to the curation-chosen ones.
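The rule itself is simple enough to sketch standalone (a hypothetical function operating on plain vectors rather than the bliss_significance.tsv tibble; the default cutoffs are the ones chosen in the paper):

```r
# Hypothetical sketch of the Bliss-all-combinations classification rule:
# synergy <=> mean Bliss excess below the threshold AND Bonferroni-corrected
# p-value below the cutoff (defaults as chosen in Flobak et al. 2019)
is_bliss_synergy = function(mean_bliss, p_bonf,
                            bliss_cutoff = -0.08, p_cutoff = 0.05) {
  mean_bliss < bliss_cutoff & p_bonf < p_cutoff
}

is_bliss_synergy(mean_bliss = c(-0.12, -0.05, -0.20),
                 p_bonf     = c(0.010, 0.001, 0.200))
# => TRUE FALSE FALSE
```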
We first load the data from the bliss_significance.tsv file and tidy ’em up a bit:
bliss.all.comb.data = drop_na(read_tsv(file = "bliss_significance.tsv"))
bliss.synergies = read_tsv(file = "bliss_synergies.tsv")
# Fix the cell line names
bliss.all.comb.data = bliss.all.comb.data %>%
mutate(cell_line = case_when(
cell_line == "SW-620" ~ "SW620",
cell_line == "COLO 205" ~ "Colo205",
cell_line == "DU-145" ~ "DU145",
cell_line == "SF-295" ~ "SF295",
cell_line == "UACC-62" ~ "UACC62",
TRUE ~ cell_line
))
bliss.synergies = bliss.synergies %>%
mutate(cell_line = case_when(
cell_line == "SW-620" ~ "SW620",
cell_line == "COLO 205" ~ "Colo205",
cell_line == "DU-145" ~ "DU145",
cell_line == "SF-295" ~ "SF295",
cell_line == "UACC-62" ~ "UACC62",
TRUE ~ cell_line
))
# Remove rows with SF drug in the combination
bliss.all.comb.data = bliss.all.comb.data %>% filter(!grepl('SF', drugcombo))
bliss.synergies = bliss.synergies %>% filter(!grepl('SF', drugcombo))
# Fix the drug names (add a dash between the drug names)
bliss.all.comb.data = bliss.all.comb.data %>%
mutate(drug_combo = str_replace(drugcombo, " ", "-")) %>%
select(cell_line, drug_combo, mean.bliss.estimate, p.value.bonferroni)
# below results are the same
# cell.line = "AGS"
# bliss.all.comb.data %>% filter(p.value.bonferroni < 0.05, mean.bliss.estimate < -0.08, cell_line == cell.line)
# bliss.synergies %>% filter(cell_line == cell.line, !grepl('SF', drugcombo))
Next, we choose the parameter space for the two thresholds of interest:

- p_value_bonferroni: values from \(0.00001\) to \(0.05\), plus some larger ones up to the maximum in the dataset (equal to \(1\))
- mean_bliss_threshold: values from \(\sim -0.5\) to \(\sim +0.42\) with a step of \(\sim 0.0227\)

We calculate, for each combination of the above two parameters and for each cell line, the synergies found from this dataset and compare them with the GS synergies to derive an F1-Score for that parameter combination. Then we aggregate the results across the 8 cell lines per parameter combination, to find the average F1-Score per (p_value_bonferroni, mean_bliss_threshold) combination:
p.values = c(0.00001, 0.00005, 0.0001, 0.0005, 0.001, 0.005, 0.01, 0.02, 0.03, 0.04, 0.05, 0.1, 0.5, 1)
bliss.thresholds = seq(from = min(bliss.all.comb.data$mean.bliss.estimate), to = max(bliss.all.comb.data$mean.bliss.estimate), length.out = 42)
data.list = list()
index = 1
for (cell.line in cell.lines) {
gset = gs.list[[cell.line]]
for (p.value in p.values) {
for (threshold in bliss.thresholds) {
mset = bliss.all.comb.data %>%
filter(p.value.bonferroni <= p.value, mean.bliss.estimate <= threshold, cell_line == cell.line) %>%
pull(drug_combo)
data.list[[index]] = tibble(cell_line = cell.line, p_value_bonferroni = p.value, mean_bliss_threshold = threshold, f1.score = get.score(gset, mset))
index = index + 1
}
}
}
bliss.all.comb.res = bind_rows(data.list)
# Find avg F1-Score across all cell lines
avg.f1.bliss.all = bliss.all.comb.res %>%
group_by(p_value_bonferroni, mean_bliss_threshold) %>%
summarise(avg.f1.score = mean(f1.score)) %>%
ungroup()
caption.title = "Average F1-Score across 8 cell lines (Bliss-All-Combinations)"
datatable(data = avg.f1.bliss.all, options = list(
order = list(list(3, 'desc'))),
caption = htmltools::tags$caption(caption.title, style="color:#dd4814; font-size: 18px")) %>%
formatRound(2:3, digits = 3)
The best parameter combination with p_value_bonferroni \(<0.05\) - so somewhat trustable :) - holds the 4th largest average F1-Score (\(0.453\)) and has a mean_bliss_threshold \(\le -0.096\). Thus the choice of Barbara and Asmund in (Flobak et al. 2019) of mean_bliss_threshold \(<-0.08\) and p_value_bonferroni \(<0.05\) to derive the synergies was surely one of the best choices they could make (call it intuition or eye-balling, this choice is now mathematically substantiated)!

Visualizing the results of the above table using a contour plot we have:
plot_ly(data = avg.f1.bliss.all, x = ~-log10(p_value_bonferroni),
y = ~mean_bliss_threshold, z = ~avg.f1.score, zmin = 0, zmax = max.avg.f1.score, #width = 1000,
colorscale = "Jet", type = "contour", contours = list(showlabels = TRUE)) %>%
colorbar(title = list(text = "Mean F1-Score \n(8 cell lines)", font = my.colorbar.font),
tickmode = "linear", tick0 = 0, dtick = 0.1) %>%
layout(annotations = list(
list(x = 0.5, y = 1, xanchor = "center", yanchor = "bottom", align = "center",
xref='paper', yref='paper', showarrow = FALSE,
font = my.title.font, text = "Bliss-All-Combinations Method"))) %>%
config(displayModeBar = FALSE)
The larger the p_value_bonferroni (the values to the right of the above plot), the harder it is to get a good value for the average F1-Score across all cell lines.
Flobak, Åsmund, Barbara Niederdorfer, Vu To Nakstad, Liv Thommesen, Geir Klinkenberg, and Astrid Lægreid. 2019. “A high-throughput drug combination screen of targeted small molecule inhibitors in cancer cell lines.” Scientific Data 6 (1): 237. https://doi.org/10.1038/s41597-019-0255-7.
Flobak, Åsmund, Miguel Vazquez, Astrid Lægreid, and Alfonso Valencia. 2017. “CImbinator: a web-based tool for drug synergy analysis in small- and large-scale datasets.” Edited by Jonathan Wren. Bioinformatics 33 (15): 2410–2. https://doi.org/10.1093/bioinformatics/btx161.
The Gold standard is actually taken from an online file that holds the results of a manual/visual inspection of the SINTEF dataset (the dose-response curves) and it is a consensus of three curators: Asmund Flobak, Barbara Niederdorfer and Evelina Folkesson. The synergy_classification_by_GS task from the rbbt SINTEF workflow has to be at least on the 2c5ea commit to use the latest version of that file↩
Computed from the bliss and hsa CombinationIndex task results and not taken directly from Barbara’s file any more as it was done with the observed_synergies_averaged task (i.e. on the SINTEF workflow commit d1a2e5 and before)↩
Also uses the bliss and hsa CombinationIndex task results↩
Also uses the bliss and hsa CombinationIndex task results↩